Mixture general diagnostic model

ABSTRACT

Disclosed herein is a method of analyzing examinee item response data comprising constructing a diagnosis model for reporting skill profiles of examinees, wherein the diagnosis model comprises at least a variable representing unobserved subpopulations, creating an item design matrix, distributing examinees across the unobserved subpopulations, iteratively estimating values for a plurality of variables within the diagnosis model, and reporting the estimated values to a user.

REFERENCE TO RELATED APPLICATIONS

This application claims priority to U.S. Application No. 60/824,837, filed Sep. 7, 2006, and entitled “Mixture General Diagnostic Model.”

BACKGROUND OF INVENTION

Item Response Theory (IRT) is a body of theory used in the field of psychometrics. In IRT, mathematical models are applied to analyze data from tests or questionnaires in order to measure abilities and attitudes studied in psychometrics. One branch of IRT is diagnostic models. Diagnostic models may be used to provide skill profiles, thereby offering additional information about the examinees. One central tenet behind diagnostic models is that different items tap into different sets of skills or examinee attributes and that experts can generate a matrix of relations between items and skills required to solve these items. A diagnostic model according to the prior art, the General Diagnostic Model (GDM), will hereinafter be described.

In order to define the GDM, several assumptions must first be presented. Assume an I-dimensional categorical random variable $\vec{x} = (x_1, \ldots, x_I)$ with $x_i \in \{0, \ldots, m_i\}$ for $i \in \{1, \ldots, I\}$, which may be referred to as a response vector. Further assume that there are N independent and identically distributed (i.i.d.) realizations $\vec{x}_1, \ldots, \vec{x}_N$ of this random variable $\vec{x}$, so that $x_{ni}$ denotes the i-th component of the n-th realization $\vec{x}_n$. In addition, assume that there are N unobserved realizations of a K-dimensional categorical variable, $\vec{a} = (a_1, \ldots, a_K)$, so that the vector
$$(\vec{x}_n, \vec{a}_n) = (x_{n1}, \ldots, x_{nI}, a_{n1}, \ldots, a_{nK})$$
exists for all $n \in \{1, \ldots, N\}$. The data structure
$$(X, A) = \left( (\vec{x}_n, \vec{a}_n) \right)_{n=1,\ldots,N}$$
may be referred to as the complete data, and $X = (\vec{x}_n)_{n=1,\ldots,N}$ is referred to as the observed data matrix. Denote $A = (\vec{a}_n)_{n=1,\ldots,N}$ as the latent skill or attribute patterns, which are the unobserved target of inference.

Let $P(\vec{a}) = P(\vec{A} = (a_1, \ldots, a_K)) > 0$ for all $\vec{a}$ denote the nonvanishing discrete count density of $\vec{a}$. Assume that the conditional discrete count density $P(x_1, \ldots, x_I \mid \vec{a})$ exists for all $\vec{a}$. Then the probability of a response vector $\vec{x}$ can be written as
$$P(\vec{x}) = \sum_{\vec{a}} P(\vec{a}) \, P(x_1, \ldots, x_I \mid \vec{a})$$

Thus far, no assumptions have been made about the specific form of the conditional distribution of $\vec{x}$ given $\vec{a}$, other than that $P(x_1, \ldots, x_I \mid \vec{a})$ exists. For the GDM, local independence (LI) of the components of $\vec{x}$ given $\vec{a}$ may be assumed, which yields
$$P(x_1, \ldots, x_I \mid \vec{a}) = \prod_{i=1}^{I} p_i(x = x_i \mid \vec{a})$$
so that the probability $p_i(x = x_i \mid \vec{a})$ is the one component left to be specified to arrive at a model for $P(\vec{x})$.
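The marginal probability of a response vector under local independence can be illustrated with a minimal Python sketch. The function names (`marginal_prob`, `toy_item_prob`), the uniform skill distribution, and the toy item probabilities are hypothetical and chosen only for illustration; they are not part of the disclosed model.

```python
from itertools import product


def marginal_prob(x, skill_prior, item_prob):
    """P(x) = sum over a of P(a) * prod over i of p_i(x_i | a)."""
    total = 0.0
    for a, p_a in skill_prior.items():
        lik = 1.0
        for i, x_i in enumerate(x):
            lik *= item_prob(i, x_i, a)  # local independence: multiply item terms
        total += p_a * lik
    return total


# Hypothetical toy setup: K = 2 binary skills, I = 3 binary items.
patterns = list(product([0, 1], repeat=2))
skill_prior = {a: 0.25 for a in patterns}  # assumed uniform P(a)


def toy_item_prob(i, x_i, a):
    # Illustrative only: the success chance rises with the number of mastered skills.
    p_correct = 0.2 + 0.3 * sum(a)
    return p_correct if x_i == 1 else 1.0 - p_correct


# Summing P(x) over all 2^3 response vectors should give 1.
all_x = list(product([0, 1], repeat=3))
total_mass = sum(marginal_prob(x, skill_prior, toy_item_prob) for x in all_x)
```

Because each $p_i(\cdot \mid \vec{a})$ is a proper distribution and the $P(\vec{a})$ sum to one, the marginal probabilities over all response vectors sum to one, which is a convenient sanity check for any implementation.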

Logistic models have secured a prominent position among models for categorical data. The GDM may also be specified as a model with a logistic link between an argument, which depends on the random variables involved and some real valued parameters, and the probability of the observed response.

Using the above definitions, the GDM may be defined as follows. Let
$$Q = (q_{ik}), \quad i = 1, \ldots, I, \quad k = 1, \ldots, K$$
be a binary I×K matrix, that is, $q_{ik} \in \{0, 1\}$. Let
$$(\gamma_{ikx}), \quad i = 1, \ldots, I, \quad k = 1, \ldots, K, \quad x = 1, \ldots, m_i$$
be a cube of real valued parameters, and let $\beta_{ix}$ for $i = 1, \ldots, I$ and $x \in \{0, \ldots, m_i\}$ be real valued parameters. Then define
$$p_i(x \mid \vec{a}) = \frac{\exp\left( \beta_{ix} + \sum_{k} \gamma_{ikx} \, h(q_{ik}, a_k) \right)}{1 + \sum_{y=1}^{m_i} \exp\left( \beta_{iy} + \sum_{k} \gamma_{iky} \, h(q_{ik}, a_k) \right)}$$

It may be convenient to constrain the $\gamma_{ikx}$ somewhat and to specify the real valued function $h(q_{ik}, a_k)$ and the $a_k$ in a way that allows emulation of models frequently used in educational measurement and psychometrics. It may be convenient to choose $h(q_{ik}, a_k) = q_{ik} a_k$ and $\gamma_{ikx} = x \gamma_{ik}$.
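As an illustration, the item response probability with the choices $h(q_{ik}, a_k) = q_{ik} a_k$ and $\gamma_{ikx} = x \gamma_{ik}$ can be sketched in Python as follows. The function name `gdm_item_probs` and the parameter values are hypothetical. The sketch also assumes that category 0 is the reference category with $\beta_{i0} = 0$, so that its numerator equals 1 and matches the leading 1 in the denominator.

```python
import math


def gdm_item_probs(beta, gamma, q, a):
    """Return [p_i(0|a), ..., p_i(m_i|a)] for a single item i.

    beta  : [beta_i1, ..., beta_im]   (beta_i0 = 0 is assumed)
    gamma : [gamma_i1, ..., gamma_iK] (slopes, gamma_ikx = x * gamma_ik)
    q     : row i of the Q-matrix, entries in {0, 1}
    a     : skill pattern (a_1, ..., a_K)
    """
    slope = sum(g * qk * ak for g, qk, ak in zip(gamma, q, a))
    # Category 0 contributes exp(0) = 1; category x contributes exp(beta_ix + x * slope).
    numerators = [1.0] + [math.exp(b + (x + 1) * slope) for x, b in enumerate(beta)]
    denom = sum(numerators)
    return [n / denom for n in numerators]


# Hypothetical binary item that requires skill 1 only (q = [1, 0]).
probs = gdm_item_probs(beta=[-0.5], gamma=[1.2, 0.8], q=[1, 0], a=(1, 1))
```

For a binary item the expression reduces to an ordinary logistic function of $\beta_{i1} + \sum_k \gamma_{ik} q_{ik} a_k$; skills with $q_{ik} = 0$ do not affect the item.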

The GDM has some unfortunate limitations. Primarily, it is not equipped to handle unobserved partitions, or subpopulations, in the examinees.

Thus, there is a need for a diagnostic model that may be extended to handle unobserved subpopulations.

BRIEF DESCRIPTION OF DRAWINGS

Aspects, features, benefits and advantages of the embodiments of the present invention will be apparent with regard to the following description, appended claims and accompanying drawings where:

FIG. 1 is a flow chart illustrating an exemplary method of applying diagnosis models to examinee data;

FIG. 2 is a diagram of an exemplary system upon which the above-referenced method may operate.

SUMMARY OF THE INVENTION

Disclosed herein is a method of analyzing examinee item response data comprising constructing a diagnosis model for reporting skill profiles of examinees, wherein the diagnosis model comprises at least a variable representing unobserved subpopulations, creating an item design matrix, distributing examinees across the unobserved subpopulations, iteratively estimating values for a plurality of variables within the diagnosis model, and reporting the estimated values to a user.

Also disclosed herein is a method of analyzing examinee item response data comprising constructing a diagnosis model for reporting skill profiles of examinees, wherein the diagnosis model comprises at least a variable representing unobserved subpopulations and a cluster variable, creating an item design matrix, distributing examinees across the unobserved subpopulations, iteratively estimating values for a plurality of variables within the diagnosis model, and reporting the estimated values to a user.

DETAILED DESCRIPTION OF THE INVENTION

As mentioned above, the GDM has some limitations. In particular, the GDM does not take into account different behaviors among various subpopulations. That is, the probability of an observation $\vec{x}$ may depend not only on the unobserved latent trait, $\vec{a}$, but also on a subpopulation identifier g. The subpopulation identifier g may be observed, but is often unobserved. Mixture distribution models are useful because observations from different subpopulations may differ in their distribution of skills, in their approach to the test items, or in both. A discrete mixture distribution in the setup of random variables as introduced above includes an unobserved grouping indicator $g_n$ for $n = 1, \ldots, N$. The complete data for examinee n then becomes $(\vec{x}_n, \vec{a}_n, g_n)$, of which only $\vec{x}_n$ is observed in mixture distribution models. A mixture GDM, or MGDM, will hereinafter be disclosed.

In order to accommodate different groups, the conditional independence assumption has to be modified, that is,
$$P(\vec{x} \mid \vec{a}, g) = P(x_1, \ldots, x_I \mid \vec{a}, g) = \prod_{i=1}^{I} p_i(x = x_i \mid \vec{a}, g)$$
Moreover, assume that the conditional probability of the components $x_i$ of $\vec{x}$ depends on nothing but $\vec{a}$ and g, that is,
$$P(\vec{x} \mid \vec{a}, g, z) = \prod_{i=1}^{I} p_i(x = x_i \mid \vec{a}, g) = P(\vec{x} \mid \vec{a}, g) \qquad \text{(Equation 1)}$$
for any random variable z. In mixture models, when the $g_n$ are not observed, the marginal probability of a response vector $\vec{x}$ needs to be found, that is,
$$P(\vec{x}) = \sum_{g} \pi_g \, P(\vec{x} \mid g) \qquad \text{(Equation 2)}$$
where $P(\vec{x} \mid g) = \sum_{\vec{a}} p(\vec{a} \mid g) \, P(\vec{x} \mid \vec{a}, g)$. The $\pi_g = P(G = g)$ may be referred to as mixing proportions, or class sizes.

The class-specific probability of a response vector $\vec{x}$ given skill pattern $\vec{a}$ in the MGDM may then be defined as
$$P(\vec{x} \mid \vec{a}, g) = \prod_{i=1}^{I} P(x_i \mid \vec{a}, g) = \prod_{i=1}^{I} \left[ \frac{\exp\left( \beta_{i x_i g} + \sum_{k} x_i \gamma_{ikg} q_{ik} a_k \right)}{1 + \sum_{y=1}^{m_i} \exp\left( \beta_{iyg} + \sum_{k} y \, \gamma_{ikg} q_{ik} a_k \right)} \right] \qquad \text{(Equation 3)}$$
with class-specific item difficulties $\beta_{ixg}$. The $\gamma_{ikg}$ are the slope parameters relating skill k to item i in class g. This equation may be used to model, for instance, both polytomous and binary data.
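Equations 2 and 3 can be combined into a short Python sketch of the MGDM marginal probability. All names (`item_probs`, `mgdm_marginal`) and all parameter values are hypothetical; the sketch assumes binary skills, binary items, and a reference category 0 with $\beta_{i0g} = 0$.

```python
import math
from itertools import product


def item_probs(beta, gamma, q, a):
    """Category probabilities of one item in one class (inner term of Equation 3)."""
    slope = sum(g * qk * ak for g, qk, ak in zip(gamma, q, a))
    nums = [1.0] + [math.exp(b + (y + 1) * slope) for y, b in enumerate(beta)]
    z = sum(nums)
    return [n / z for n in nums]


def mgdm_marginal(x, pi, skill_dist, beta, gamma, Q):
    """P(x) = sum_g pi_g * sum_a p(a|g) * prod_i P(x_i | a, g)   (Equation 2)."""
    total = 0.0
    for g, pi_g in enumerate(pi):
        for a, p_a in skill_dist[g].items():
            lik = 1.0
            for i, x_i in enumerate(x):
                lik *= item_probs(beta[g][i], gamma[g][i], Q[i], a)[x_i]
            total += pi_g * p_a * lik
    return total


# Hypothetical example: 3 binary items, 2 binary skills, 2 latent classes.
Q = [[1, 0], [0, 1], [1, 1]]
pi = [0.6, 0.4]                                    # mixing proportions pi_g
skill_dist = [
    {a: 0.25 for a in product([0, 1], repeat=2)},  # p(a | g = 0), assumed uniform
    {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4},
]
beta = [[[-1.0], [0.0], [-0.5]], [[0.5], [-0.5], [0.0]]]       # beta[g][i]
gamma = [[[1.0, 0.0], [0.0, 1.5], [0.8, 0.8]],
         [[2.0, 0.0], [0.0, 0.5], [1.0, 1.0]]]                 # gamma[g][i][k]

total_mass = sum(mgdm_marginal(x, pi, skill_dist, beta, gamma, Q)
                 for x in product([0, 1], repeat=3))
```

The two classes in this toy example differ both in their skill distributions and in their item parameters, which corresponds to the general MGDM without measurement invariance.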

One special case of the MGDM is a model that assumes measurement invariance across populations, which is expressed in the equality of $P(\vec{x} \mid \vec{a}, g)$ across groups or, more formally,
$$P(x_i \mid \vec{a}, g) = P(x_i \mid \vec{a}, c)$$
for all $i \in \{1, \ldots, I\}$ and all $g, c \in \{1, \ldots, G\}$.

Under this assumption, the MGDM equation may be rewritten without the group index g in the conditional response probabilities, so that
$$P(\vec{x}) = \sum_{g} \pi_g \, P(\vec{x} \mid g) = \sum_{g} \pi_g \sum_{\vec{a}} p(\vec{a} \mid g) \prod_{i=1}^{I} P(x_i \mid \vec{a}) \qquad \text{(Equation 4)}$$
Note that the differences between groups are only present in the $p(\vec{a} \mid g)$, so that the skill distribution is the only component with a condition on g in the above equation.

The MGDM may be expanded to introduce an additional structure, referred to as a cluster variable. This expanded model may be referred to as a Hierarchical GDM, or HGDM. This cluster variable may be used to account for correlations in the data. One example of clustered data is the responses to educational assessments sampled from students within schools or classrooms. For instance, it seems plausible to assume that students within schools are more similar than students across schools. In addition to the grouping variable g, the hierarchical extension of the GDM assumes that each observation n is characterized by an outcome $s_n$ on a clustering variable s. The clusters identified by this outcome may be schools, classrooms, or other sampling units representing the hierarchical structure of the data collection. The (unobserved) group membership g is thought of as an individual classification variable; for two examinees n≠m there may be two different group memberships, that is, both $g_n = g_m$ and $g_n \neq g_m$ are permissible even if the examinees belong to the same cluster (i.e., $s_n = s_m$).

Moreover, it may be assumed that the skill distribution depends only on the group indicator g and no other variable, that is,
$$P(\vec{a} \mid g, z) = P(\vec{a} \mid g) \qquad \text{(Equation 5)}$$
for any random variable z. More specifically, for the clustering variable s,
$$P(g) = \sum_{s=1}^{S} p(s) \, P(g \mid s).$$
With Equation 5,
$$P(g \mid s) \, P(\vec{a} \mid g) = P(g \mid s) \, P(\vec{a} \mid g, s) = P(\vec{a}, g \mid s)$$
and
$$P(\vec{x} \mid g, s) = \sum_{\vec{a}} P(\vec{a} \mid g, s) \, P(\vec{x} \mid \vec{a}, g, s) = \sum_{\vec{a}} p(\vec{a} \mid g) \, P(\vec{x} \mid \vec{a}, g) = P(\vec{x} \mid g)$$
with Equations 1 and 5. Then the marginal distribution of a response pattern $\vec{x}$ in the HGDM is given by
$$P(\vec{x}) = \sum_{s} p(s) \sum_{g} P(g \mid s) \sum_{\vec{a}} P(\vec{a} \mid g) \, P(\vec{x} \mid \vec{a}, g) \qquad \text{(Equation 6)}$$
where, as above with respect to the MGDM, the $p(\vec{a} \mid g)$ denote the distribution of the skill patterns in group g, and the $P(\vec{x} \mid \vec{a}, g)$ denote the distribution of the response vector $\vec{x}$ conditional on skill pattern $\vec{a}$ and group g.

An HGDM that assumes measurement invariance across clusters and across groups is defined by
$$P(\vec{x}) = \sum_{s} p(s) \sum_{g} P(g \mid s) \sum_{\vec{a}} P(\vec{a} \mid g) \, P(\vec{x} \mid \vec{a}) \qquad \text{(Equation 7)}$$
with conditional response probabilities $P(\vec{x} \mid \vec{a}) = \prod_{i} p(x_i \mid \vec{a})$ that do not depend on cluster or group variables.
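The role of the cluster variable in Equation 6 can be sketched in Python as follows. The sketch takes the class-specific probabilities $P(\vec{x} \mid g) = \sum_{\vec{a}} p(\vec{a} \mid g) P(\vec{x} \mid \vec{a}, g)$ of one fixed response vector as given; the function names and all numerical values are hypothetical.

```python
def hgdm_marginal(p_s, p_g_given_s, p_x_given_g):
    """P(x) = sum_s p(s) * sum_g P(g|s) * P(x|g)   (Equation 6, inner sum over a precomputed)."""
    return sum(
        p_s[s] * sum(p_g_given_s[s][g] * p_x_given_g[g]
                     for g in range(len(p_x_given_g)))
        for s in range(len(p_s))
    )


def group_sizes(p_s, p_g_given_s):
    """P(g) = sum_s p(s) * P(g|s)."""
    n_groups = len(p_g_given_s[0])
    return [sum(p_s[s] * p_g_given_s[s][g] for s in range(len(p_s)))
            for g in range(n_groups)]


# Hypothetical values: 3 clusters (e.g., schools) and 2 latent groups.
p_s = [0.5, 0.3, 0.2]
p_g_given_s = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]  # one row per cluster
p_x_given_g = [0.12, 0.03]                          # P(x|g) for one fixed x

p_x = hgdm_marginal(p_s, p_g_given_s, p_x_given_g)
p_g = group_sizes(p_s, p_g_given_s)
# Marginalizing over clusters first gives the same result as Equation 2
# with pi_g = P(g).
p_x_via_mixture = sum(pg * px for pg, px in zip(p_g, p_x_given_g))
```

The agreement of the two computations reflects that the clusters only change how the group proportions are distributed; the response model within each group is unaffected.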

The increase in complexity of HGDMs over, for instance, the MGDM, lies in the fact that the group distribution $P(g \mid s)$ depends on the cluster variable s. This increases the number of group or class size parameters depending on the number of clusters $\#\{s : s \in S\}$.

Hereinafter, an exemplary method of applying the above models to examinee data will be described with respect to FIG. 1. First, data is defined 110. More specifically, defining data may comprise steps of defining item response variables or defining mixture components. Then, a skill space is created 120. More specifically, creating a skill space may comprise defining the number of skills and defining the assumed skill levels (e.g., how many, which numerical anchor) for each of the assumed skill variables. Then, an item design matrix (e.g., a Q-matrix) is created 130, which may relate the item response variables to the assumed skill variables. Then, data for each examinee may be read 140. For instance, this step may comprise reading item response variables and grouping variables. Then, the examinees may be randomly distributed 150 across groups. Then, initial skill distributions are calculated 160.
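Steps 120 through 160 may be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the variable names, the in-memory response data, the number of groups, and the uniform initial skill distributions are all hypothetical choices.

```python
import random
from itertools import product

# Step 120: create the skill space (assumed: 2 skills with levels 0 and 1).
n_skills = 2
skill_levels = [0, 1]
skill_space = list(product(skill_levels, repeat=n_skills))

# Step 130: item design matrix (Q-matrix), one row per item, one column per skill.
Q = [[1, 0], [0, 1], [1, 1]]

# Step 140: read examinee data (made-up responses to 3 binary items).
responses = [[1, 0, 1], [0, 0, 0], [1, 1, 1], [0, 1, 0]]

# Step 150: randomly distribute the examinees across the latent groups.
n_groups = 2
rng = random.Random(0)  # fixed seed so that the sketch is reproducible
groups = [rng.randrange(n_groups) for _ in responses]

# Step 160: initial skill distributions, assumed uniform within each group.
init_skill_dist = [
    {a: 1.0 / len(skill_space) for a in skill_space}
    for _ in range(n_groups)
]
```

The random assignment only provides starting values; the group memberships are re-estimated in every cycle of the iterative estimation that follows.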

Then, the MGDM or HGDM statistics are calculated 170. In a preferred embodiment, this is performed using an expectation-maximization (EM) algorithm such as the one disclosed in “Multilevel latent class models,” by J. K. Vermunt, published in Sociological Methodology 33, which is incorporated by reference herein. The EM algorithm will be described in more detail as it applies to the HGDM. One of ordinary skill will appreciate that this method is easily applicable to the MGDM as well.

Since the data are structured hierarchically, the first step is to define the complete data for the HGDM. Let S denote the number of clusters in the sample, and let $N_s$ denote the number of examinees in cluster s, for $s = 1, \ldots, S$. Then, let $x_{ins}$ denote the i-th response of the n-th examinee in cluster s and let $\vec{x}_{ns}$ denote the complete observed response vector of examinee n in cluster s. Further, let $a_{kns}$ denote the k-th skill of examinee n in cluster s and let $\vec{a}_{ns}$ denote the skill pattern of examinee n in cluster s. Finally, let $g_{ns}$ denote the group membership of examinee n in cluster s. Note that only the $x_{ins}$ are observed, as are the cluster sizes $N_s$ and the number of clusters S. The $a_{kns}$ and $g_{ns}$ are unobserved and have to be inferred by making model assumptions and calculating posterior probabilities such as $P(g \mid s)$ and $P(\vec{a}, g \mid \vec{x}, s)$.

For the complete data (i.e., the observed data $\vec{x}$ in conjunction with the unobserved skill profiles $\vec{a}$ and group membership g), the likelihood is
$$L = \prod_{s=1}^{S} \prod_{n=1}^{N_s} P(\vec{x}_{ns}, \vec{a}_{ns}, g_{ns}; s)$$
that is, a product over cluster-specific distributions of the complete data. With the above assumptions,
$$L = \prod_{s=1}^{S} \prod_{n=1}^{N_s} P(\vec{x}_{ns} \mid \vec{a}_{ns}, g_{ns}) \, p(\vec{a}_{ns} \mid g_{ns}) \, p(g_{ns} \mid s)$$
which equals
$$L = L_{\vec{x}} \times L_{\vec{a}} \times L_{g}$$
with
$$L_{\vec{x}} \times L_{\vec{a}} \times L_{g} = \left( \prod_{s=1}^{S} \prod_{n=1}^{N_s} P(\vec{x}_{ns} \mid \vec{a}_{ns}, g_{ns}) \right) \left( \prod_{s=1}^{S} \prod_{n=1}^{N_s} p(\vec{a}_{ns} \mid g_{ns}) \right) \left( \prod_{s=1}^{S} \prod_{n=1}^{N_s} p(g_{ns} \mid s) \right)$$
Note that these components may be rearranged and rewritten as
$$L_{\vec{x}} = \prod_{s=1}^{S} \prod_{n=1}^{N_s} \prod_{i=1}^{I} P(x_{ins} \mid \vec{a}_{ns}, g_{ns}) = \prod_{g} \prod_{\vec{a}} \prod_{i} \prod_{x} P(X_i = x \mid \vec{a}, g)^{n_i(x, \vec{a}, g)}$$
where $n_i(x, \vec{a}, g) = \sum_{s} n_i(x, \vec{a}, g, s)$ is the frequency of category x responses on item i for examinees with skill pattern $\vec{a}$ in group g. Also,
$$L_{\vec{a}} = \prod_{s=1}^{S} \prod_{n=1}^{N_s} p(\vec{a}_{ns} \mid g_{ns}) = \prod_{\vec{a}} \prod_{g} p(\vec{a} \mid g)^{n(\vec{a}; g)}$$
where $n(\vec{a}; g)$ is the frequency of skill pattern $\vec{a}$ in group g. Finally,
$$L_{g} = \prod_{s=1}^{S} \prod_{n=1}^{N_s} p(g_{ns} \mid s) = \prod_{s} \prod_{g} p(g \mid s)^{n(g; s)}$$
holds. The $n(g; s)$ represents the frequency of membership in group g in cluster s.
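Because the likelihood factors into three components, each component can be maximized separately once the counts are available. As a minimal Python sketch, and assuming that the skill distributions and the group proportions are left unconstrained (an assumption made here for illustration, not a statement of the disclosed estimation procedure), the factors $L_{\vec{a}}$ and $L_{g}$ are multinomial and are maximized by relative frequencies. The function name and the counts are hypothetical; the item parameters in $L_{\vec{x}}$ generally require numerical optimization and are not shown.

```python
def normalize_rows(counts):
    """Turn each row of (expected) counts into a probability distribution."""
    out = []
    for row in counts:
        total = sum(row)
        out.append([c / total for c in row])
    return out


# Hypothetical counts n(a; g): one row per group g, one column per skill pattern a.
n_a_g = [[30.0, 10.0, 10.0, 50.0],
         [5.0, 5.0, 5.0, 5.0]]
# Hypothetical counts n(g; s): one row per cluster s, one column per group g.
n_g_s = [[18.0, 2.0],
         [6.0, 14.0]]

p_a_given_g = normalize_rows(n_a_g)  # estimate of p(a|g) = n(a; g) / sum_a n(a; g)
p_g_given_s = normalize_rows(n_g_s)  # estimate of p(g|s) = n(g; s) / sum_g n(g; s)
```

In the EM algorithm the same updates would be applied to expected counts rather than observed counts, since $\vec{a}$ and g are not observed.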

The EM algorithm cycles through the generation of expected values and the maximization of parameters given these preliminary expectations until convergence is reached. This process is well known in the art and is described in, for instance, The EM-algorithm and extensions by McLachlan et al., published by Wiley, which is incorporated by reference herein. For the HGDM, there are three different types of expected values to be generated in the E-step. First, $\hat{n}_i(x, \vec{a}, g) = \sum_{s} \sum_{n} 1\{x_{ins} = x\} \, P(\vec{a}, g \mid \vec{x}_{ns}, s)$ is the expected frequency of response x to item i for examinees with skill pattern $\vec{a}$ in group g, estimated across clusters and across examinees within clusters. Second, $\hat{n}(\vec{a}, g) = \sum_{s} \sum_{n} P(\vec{a}, g \mid \vec{x}_{ns}, s)$ is the expected frequency of skill pattern $\vec{a}$ and group g, estimated across clusters and across examinees within clusters. Finally, $\hat{n}(g, s) = \sum_{n} P(g \mid \vec{x}_{ns}, s)$ is the expected frequency of group g in cluster s, estimated across examinees in that cluster. For the first and second type of the required expected counts, this involves estimating
$$P(\vec{a}, g \mid \vec{x}, s) = \frac{P(\vec{x}, s, \vec{a}, g)}{\sum_{g'} P(\vec{x}, s, g')} = \frac{P(\vec{x} \mid \vec{a}, g) \, p(\vec{a} \mid g) \, p(g \mid s)}{\sum_{g'} \sum_{\vec{a}'} P(\vec{x} \mid \vec{a}', g') \, p(\vec{a}' \mid g') \, p(g' \mid s)}$$
for each response pattern $\vec{x}_{ns}$, for $s = 1, \ldots, S$ and $n = 1, \ldots, N_s$, where the factor p(s) cancels from the numerator and the denominator.

For the third type of expected count, use
$$p(g \mid \vec{x}, s) = \sum_{\vec{a}} P(\vec{a}, g \mid \vec{x}, s)$$
which is equivalent to
$$p(g \mid \vec{x}, s) = \frac{P(\vec{x}, s, g)}{\sum_{g'} P(\vec{x}, s, g')} = \frac{\sum_{\vec{a}} P(\vec{x} \mid \vec{a}, g) \, p(\vec{a} \mid g) \, p(g \mid s)}{\sum_{g'} \left[ \sum_{\vec{a}} P(\vec{x} \mid \vec{a}, g') \, p(\vec{a} \mid g') \, p(g' \mid s) \right]}$$
This last probability then allows one to estimate the class membership g given both the observed responses $\vec{x}$ and the known cluster membership s. The utility of the clustering variable may be evaluated in terms of the increase of the maximum a posteriori probabilities $p(g \mid \vec{x}, s)$ over $p(g \mid \vec{x})$. If the clustering variable s is informative for the classification g, a noticeable increase of the maximum posterior probabilities should be observed. The improvement should also be seen in terms of the marginal log-likelihood if s is informative for g.
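The posterior probabilities used in the E-step can be sketched in Python for a single examinee. The function names and numerical values are hypothetical, and the sketch assumes that the likelihoods $P(\vec{x} \mid \vec{a}, g)$ of the examinee's response vector have already been computed for every skill pattern and group.

```python
def posterior_a_g(lik, p_a_given_g, p_g_given_s_row):
    """Return post[g][a] = P(a, g | x, s) for one examinee in cluster s.

    lik[g][a]          : P(x | a, g) for the examinee's response vector x
    p_a_given_g[g][a]  : p(a | g)
    p_g_given_s_row[g] : p(g | s) for the examinee's own cluster s
    """
    joint = [
        [lik[g][a] * p_a_given_g[g][a] * p_g_given_s_row[g]
         for a in range(len(lik[g]))]
        for g in range(len(lik))
    ]
    norm = sum(sum(row) for row in joint)  # sum over g' and a'
    return [[v / norm for v in row] for row in joint]


def posterior_g(post):
    """P(g | x, s) = sum over a of P(a, g | x, s)."""
    return [sum(row) for row in post]


# Hypothetical values: 2 groups, 2 skill patterns.
lik = [[0.1, 0.4], [0.2, 0.2]]
p_a_given_g = [[0.5, 0.5], [0.5, 0.5]]
p_g_given_s_row = [0.8, 0.2]

post = posterior_a_g(lik, p_a_given_g, p_g_given_s_row)
p_g = posterior_g(post)
```

Summing `post` over examinees (and, for the first two types of counts, over clusters) gives the expected frequencies described above; summing `p_g` over the examinees of one cluster gives the expected group frequencies in that cluster.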

Referring back to FIG. 1, once the statistics have converged, model fit measures, log-likelihoods, and related measures may be calculated 180. Then, item fit measures are calculated 185. Finally, person-based outcome statistics may be calculated 190. Calculating person-based outcome statistics may include calculating most probable group membership and most probable skill level for each skill for each examinee. These statistics, item-fit measures, log-likelihoods, and model fit measures may be presented to a user in a human-readable fashion, such as a computer display, or printout.
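One possible way to derive the person-based outcome statistics from the posterior $P(\vec{a}, g \mid \vec{x}, s)$ of an examinee is sketched below in Python. The sketch interprets the most probable skill level for each skill as the mode of the marginal posterior of that skill, which is an assumption made for illustration; the function name and the posterior values are hypothetical.

```python
def person_outcomes(post, skill_space):
    """Most probable group and most probable level of each skill for one examinee.

    post[g][j]     : P(a_j, g | x, s) for skill pattern skill_space[j]
    skill_space[j] : tuple of skill levels (a_1, ..., a_K)
    """
    p_g = [sum(row) for row in post]
    best_group = max(range(len(p_g)), key=p_g.__getitem__)

    best_levels = []
    for k in range(len(skill_space[0])):
        marginal = {}  # posterior probability of each level of skill k
        for row in post:
            for j, p in enumerate(row):
                level = skill_space[j][k]
                marginal[level] = marginal.get(level, 0.0) + p
        best_levels.append(max(marginal, key=marginal.get))
    return best_group, best_levels


# Hypothetical posterior for one examinee: 2 groups, 2 binary skills.
skill_space = [(0, 0), (0, 1), (1, 0), (1, 1)]
post = [[0.05, 0.05, 0.10, 0.40],
        [0.10, 0.10, 0.10, 0.10]]

best_group, best_levels = person_outcomes(post, skill_space)
```

In this example the first group has posterior probability 0.6, and both skills are most probably at level 1 (marginal posterior probabilities 0.7 and 0.65).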

FIG. 2 is a block diagram of exemplary internal hardware that may be used to contain or implement the program instructions of a system embodiment. Referring to FIG. 2, a bus 228 serves as the main information highway interconnecting the other illustrated components of the hardware. CPU 202 is the central processing unit of the system, performing calculations and logic operations required to execute a program. Read only memory (ROM) 218 and random access memory (RAM) 220 constitute exemplary memory devices.

A disk controller 204 interfaces with one or more optional disk drives to the system bus 228. These disk drives may be external or internal floppy disk drives such as 210, CD ROM drives 206, or external or internal hard drives 208. As indicated previously, these various disk drives and disk controllers are optional devices.

Program instructions may be stored in the ROM 218 and/or the RAM 220. Optionally, program instructions may be stored on a computer readable medium such as a floppy disk or a digital disk or other recording medium, a communications signal or a carrier wave.

An optional display interface 222 may permit information from the bus 228 to be displayed on the display 224 in audio, graphic or alphanumeric format. Communication with external devices may optionally occur using various communication ports 226. An exemplary communication port 226 may be attached to a communications network, such as the Internet or an intranet.

In addition to the standard computer-type components, the hardware may also include an interface 212 which allows for receipt of data from input devices such as a keyboard 214 or other input device 216 such as a remote control, pointer and/or joystick.

The diagnostic models described herein may be used in connection with, for instance and without limitation, English language testing, national large scale assessments, international assessments, or K-12 accountability testing. For instance, the MGDM and HGDM may be used in connection with Test of English as a Foreign Language (TOEFL) results.

While illustrative embodiments of the invention have been shown herein, it will be apparent to those skilled in the art that the invention may be embodied still otherwise without departing from the spirit and scope of the claimed invention.

1. A method of analyzing examinee item response data comprising: constructing a diagnosis model for reporting skill profiles of examinees, wherein the diagnosis model comprises at least a variable representing unobserved subpopulations; creating an item design matrix; distributing examinees across the unobserved subpopulations; iteratively estimating values for a plurality of variables within the diagnosis model; and reporting the estimated values to a user.
2. The method of claim 1, wherein the examinees are randomly distributed across the unobserved subpopulations.
3. The method of claim 1, wherein the examinee item response data is polytomous.
4. The method of claim 1, wherein the examinee item response data is binary.
5. The method of claim 1, wherein the plurality of variables comprise variables representing both item difficulties and slope parameters relating a skill to an item.
6. The method of claim 1, further comprising: defining item response variables; defining a number of skills; defining the assumed skill levels; and calculating initial skill distributions.
7. The method of claim 1, wherein measurement invariance across subpopulations is assumed.
8. A method of analyzing examinee item response data comprising: constructing a diagnosis model for reporting skill profiles of examinees, wherein the diagnosis model comprises at least a variable representing unobserved subpopulations and a cluster variable; creating an item design matrix; distributing examinees across the unobserved subpopulations; iteratively estimating values for a plurality of variables within the diagnosis model; and reporting the estimated values to a user.
9. The method of claim 8, wherein the examinees are randomly distributed across the unobserved subpopulations.
10. The method of claim 8, wherein the examinee item response data is polytomous.
11. The method of claim 8, wherein the examinee item response data is binary.
12. The method of claim 8, wherein the plurality of variables comprise variables representing both item difficulties and slope parameters relating a skill to an item.
13. The method of claim 8, further comprising: defining item response variables; defining a number of skills; defining the assumed skill levels; and calculating initial skill distributions.
14. The method of claim 8, wherein measurement invariance across subpopulations is assumed.